PRELIMINARY RESULTS FROM THE ANALYSIS OF
WIND COMPONENT ERROR
by 

P. A. Jacobs and D. P. Gaver 



0. INTRODUCTION 

Numerical meteorological models are used to assist in the prediction of 
weather. Each run of a numerical model produces forecasts of 
meteorological variables which are used as preliminary predictions of the 
future values of these variables. These initial predictions are referred to as 
first-guess values. In this paper first-guess values will refer to the most 
recent 12 hour forecasts. 

In certain areas of the world, observations of the forecasted variables
become available; in our case the observations become available 12 hours
after the first-guess values are computed. Prior to the next run of the
numerical model, a multivariate optimal interpolation analysis updates a
first-guess value of a variable by adding to it a weighted observed value of
the variable, if one is available. The weight multiplying the observed value
depends on estimates of the squared error of the first-guess value and the
squared error of the observation; cf. Goerss et al. [1991a, b]. Thus it is
important to predict such first-guess squared errors.
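
As a rough illustration of why the squared-error estimates matter, the
following sketch applies the textbook single-variable form of an optimal
interpolation update, in which the weight is the first-guess error variance
divided by the sum of the first-guess and observation error variances. The
scalar form, the function name oi_update_scalar and the numerical values are
assumptions made for illustration; the multivariate analysis of Goerss et al.
is not reproduced here.

```python
def oi_update_scalar(first_guess, observation, var_first_guess, var_observation):
    """Scalar optimal-interpolation style update (illustrative assumption).

    The first guess is corrected by a weighted observation increment. The
    weight grows as the first-guess error variance grows relative to the
    observation error variance.
    """
    weight = var_first_guess / (var_first_guess + var_observation)
    return first_guess + weight * (observation - first_guess)


# Hypothetical values: a 10 m/s first guess and a 12 m/s observation.
equal_errors = oi_update_scalar(10.0, 12.0, var_first_guess=4.0, var_observation=4.0)
poor_first_guess = oi_update_scalar(10.0, 12.0, var_first_guess=12.0, var_observation=4.0)
print(equal_errors, poor_first_guess)
```

With equal error variances the weight is 1/2; when the first-guess error
variance is three times the observation error variance the weight is 3/4, so
the analysis moves closer to the observation.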

The general problem of modeling and predicting mean square errors is 
important but not widely studied; see Efron (1986) and Jorgenson (1987). In 
the next section statistical models for the error of the first-guess are 
introduced. The models assume the error of the first-guess has mean 0 but has 
a scale parameter that is log-linear with suitable covariates, i.e. explanatory or 
regression variables.

Results are reported concerning the estimation of model parameters, and
model cross-validation and predictive ability for u, v wind component data 
from the months of February and April 1991. The data consist of 
measurements and 12 hour forecasts (first-guess values) from 93 stations in 
North America, 25N-75N. The forecasts are produced using the NOGAPS 
Spectral Forecast Model; cf. Hogan et al. Each station has measurement and 
first-guess values for every 12 hours; there are some missing observations. 
The first-guess values are subtracted from measurement values (if available) 
to obtain observations of the error of the first-guess. The results appear in 
Sections 3 and 4 and in Appendices B, C and D. 

The results indicate that estimates of the variance of the error of first-guess 
wind components can be improved by using covariates which are functions 
of the wind components. Covariates using observed values of the wind 
components appear to have more predictive ability than those using
first-guess values. Further exploratory work is needed to determine the degree
to which these statistical results can be used to improve the forecasting
ability of the numerical model.

1. THE MODELS 
Let

$U_o(t)$ = observed u-wind component at time t
$U_f(t)$ = first-guess u-wind component at time t
$V_o(t)$ = observed v-wind component at time t
$V_f(t)$ = first-guess v-wind component at time t

$$r(t) = \left[(U_o(t) - U_o(t-1))^2 + (V_o(t) - V_o(t-1))^2\right]^{1/2}$$

$$s(t) = \left[U_o(t)^2 + V_o(t)^2\right]^{1/2}$$

$$Y(t) = U_o(t) - U_f(t) \quad \text{or} \quad Y(t) = V_o(t) - V_f(t)$$

The models considered are as follows:

NORMAL MODELS:

One Variable Models

1. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$$\sigma_1^2(1;t) = \exp\{\alpha_1(1) + \beta_1(1)\, r(t)\}. \tag{1}$$

2. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$$\sigma_1^2(2;t) = \exp\{\alpha_1(2) + \beta_1(2)\, s(t)\}. \tag{2}$$

Two Variable Model

3. {Y(t)} are independent normally distributed random variables with
mean 0 and variance

$$\sigma_2^2(t) = \exp\{\alpha + \beta_1 r(t) + \beta_2 s(t)\}. \tag{3}$$
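
The quantities defined above can be computed directly from a station's time
series. The sketch below is illustrative only: the function names are
hypothetical, the series is assumed to be sampled every 12 hours with no
missing observations (the report notes that some are missing), and the
parameter values in the usage lines are the 850 mb u-wind "ALL" estimates
from Table 1.

```python
import numpy as np


def wind_covariates(u, v):
    """Return r(t) and s(t) for t = 1, ..., T-1 from consecutive wind values."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    # r(t): magnitude of the change in the wind vector since the previous time
    r = np.hypot(u[1:] - u[:-1], v[1:] - v[:-1])
    # s(t): wind speed at time t
    s = np.hypot(u[1:], v[1:])
    return r, s


def first_guess_error(observed, first_guess):
    """Y(t): observed component minus first-guess component."""
    return np.asarray(observed, dtype=float) - np.asarray(first_guess, dtype=float)


def model_variance(alpha, beta_r, beta_s, r, s):
    """Log-linear variance of model (3); set a beta to 0 for models (1), (2)."""
    return np.exp(alpha + beta_r * np.asarray(r) + beta_s * np.asarray(s))


# Hypothetical observed winds (m/s) at three consecutive times.
r, s = wind_covariates([0.0, 3.0, 3.0], [0.0, 4.0, 4.0])
variance = model_variance(1.66, 0.034, 0.049, r, s)
print(r, s, variance)
```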

CAUCHY MODELS: 

While many measurement errors of physical quantities are 
approximately normal, especially "in the middle" of their distribution, there 
can well be thicker-than-normal tails and occasional extreme outliers. These 
attributes can have seriously degrading effects in regression-like problems; cf. 
Mosteller and Tukey (1977), Huber (1981) and Hampel (1986). The Cauchy 
distribution is a symmetric distribution with thicker tails than those of the 
normal distribution. Distributions with long straggling tails have the 
tendency to produce outlying values. The following models use the Cauchy
distribution to represent, and suitably compensate for, measurement error
with thicker tails than those of the normal distribution.

One Variable Models

4. {Y(t)} are independent Cauchy random variables with scale parameter

$$\sigma_1^2(1;t) = \exp\{\alpha_1(1) + \beta_1(1)\, r(t)\}. \tag{4}$$

5. {Y(t)} are independent Cauchy random variables with scale parameter

$$\sigma_1^2(2;t) = \exp\{\alpha_1(2) + \beta_1(2)\, s(t)\}. \tag{5}$$

Two Variable Model

6. {Y(t)} are independent Cauchy random variables with scale parameter

$$\sigma_2^2(t) = \exp\{\alpha + \beta_1 r(t) + \beta_2 s(t)\}. \tag{6}$$

The form of the Cauchy density function with scale parameter $\sigma$ that is
used is

$$f(y) = \frac{1}{\pi\sigma}\left[1 + \frac{y^2}{\sigma^2}\right]^{-1} \quad \text{for } -\infty < y < \infty.$$
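
The thicker tails of this density can be seen numerically. The short sketch
below evaluates the Cauchy density given above and a normal density with
standard deviation equal to the same scale value; the function names and the
evaluation points are illustrative assumptions.

```python
import numpy as np


def cauchy_pdf(y, sigma):
    """Cauchy density with scale sigma: 1 / (pi * sigma * (1 + y^2 / sigma^2))."""
    y = np.asarray(y, dtype=float)
    return 1.0 / (np.pi * sigma * (1.0 + (y / sigma) ** 2))


def normal_pdf(y, sigma):
    """Mean-zero normal density with standard deviation sigma."""
    y = np.asarray(y, dtype=float)
    return np.exp(-0.5 * (y / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# Compare the two densities at the center and four scale units out.
sigma = 2.0
for y in (0.0, 4.0 * sigma):
    print(y, cauchy_pdf(y, sigma), normal_pdf(y, sigma))
```

Far from the center the Cauchy density is much larger than the normal
density, which is why occasional extreme errors are penalized less heavily
under the Cauchy models.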



2. ESTIMATION OF PARAMETERS 

For both normal and Cauchy models, the model parameters are estimated
by maximum likelihood. A system of equations is obtained by setting the first
partial derivative of the ln-likelihood function with respect to each
parameter equal to zero. The system of equations is solved numerically using
Newton's method to obtain the maximum likelihood estimates. The procedure for
the normal models is given in Appendix A.
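
A minimal sketch of such a Newton iteration for the normal models is given
below. It is not taken from Appendix A: the function names, the starting
value (constant variance, all slopes zero), the step-halving safeguard and
the stopping rule are assumptions made for illustration. The objective is
the normal ln-likelihood up to constants, with linear predictor
eta = alpha + x beta.

```python
import numpy as np


def loglik(theta, X, y2):
    """Normal ln-likelihood up to constants: -sum(eta) - sum(y^2 exp(-eta))."""
    eta = X @ theta
    return -np.sum(eta) - np.sum(y2 * np.exp(-eta))


def fit_newton(y, covariates, tol=1e-10, max_iter=50):
    """Fit ln variance = alpha + sum_j beta_j * covariate_j by Newton's method.

    Returns theta = (alpha, beta_1, ..., beta_p).
    """
    y2 = np.asarray(y, dtype=float) ** 2
    cols = [np.ones(len(y2))] + [np.asarray(c, dtype=float) for c in covariates]
    X = np.column_stack(cols)
    theta = np.zeros(X.shape[1])
    theta[0] = np.log(y2.mean())  # start at the constant-variance estimate
    for _ in range(max_iter):
        w = y2 * np.exp(-(X @ theta))
        grad = X.T @ (w - 1.0)          # first partial derivatives
        hess = -(X * w[:, None]).T @ X  # second partial derivatives
        step = np.linalg.solve(hess, grad)
        # Halve the step if it clearly lowers the ln-likelihood (safeguard).
        base = loglik(theta, X, y2)
        t = 1.0
        while t > 1e-8 and loglik(theta - t * step, X, y2) < base - 1e-9 * abs(base):
            t *= 0.5
        theta = theta - t * step
        if np.max(np.abs(t * step)) < tol:
            break
    return theta


# Simulated example with hypothetical true parameters (1.0, 0.05, 0.03).
rng = np.random.default_rng(0)
r = rng.uniform(0.0, 20.0, 5000)
s = rng.uniform(0.0, 20.0, 5000)
y = rng.normal(0.0, np.sqrt(np.exp(1.0 + 0.05 * r + 0.03 * s)))
print(fit_newton(y, [r, s]))
```

Passing an empty covariate list gives the constant-variance fit, for which
the estimate is the logarithm of the average squared error.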
3. THE DATA ANALYSIS: FEBRUARY DATA

3.1 Observed Wind Covariate Models

In this subsection we report an assessment of the goodness of fit and
cross-validation for the normal models (1)-(3) using observed wind
components as covariates. There are six analyses: one for the u-wind
component (respectively v-wind component) at each pressure height.
Each analysis proceeds along the same lines. In what follows, by data we mean
triples (y(t), r(t), s(t)).

In each analysis the data are randomly divided into two sets called DA 
and DB without regard to the values of the data. 

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for each set DA and DB and for all the data. The estimated values
appear in Table 1. The estimated variances $\sigma_1^2(1;t)$,
$\sigma_1^2(2;t)$, $\sigma_2^2(t)$ are computed for the parameters estimated
from DA and DB using (1)-(3) for each data point in DA and DB.

The models are for the variances of the observations rather than the
observations themselves. One possible procedure to assess goodness-of-fit
and cross-validate the models is by binning the data. To assess models (1) and
(3) the data (y(t), r(t), s(t)) are binned into 10 bins based on ordering the
values of r(t) from smallest to largest. The data in the first bin correspond
to the smallest values of r(t); the data in the 10th bin correspond to the
largest values of r(t). Each bin contains about 1/10 of the data, with the
10th bin containing a few more data. The averages of the estimated variances
for models (1) and (3) are computed for each bin. The average of $y(t)^2$ is
also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
based on the values of s(t).
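
The binning procedure just described can be sketched as follows. The function
name and return layout are hypothetical; the bin size is taken as the integer
part of n/10, with the remainder placed in the last bin, which is one way to
realize "about 1/10 of the data with the 10th bin containing a few more".

```python
import numpy as np


def binned_ln_averages(y, covariate, est_var, n_bins=10):
    """Bin by the ordered covariate; return ln averages of y^2 and est. variance.

    Returns (ln_avg_y2, ln_avg_var, counts), each of length n_bins.
    """
    y2 = np.asarray(y, dtype=float) ** 2
    est_var = np.asarray(est_var, dtype=float)
    order = np.argsort(np.asarray(covariate, dtype=float), kind="stable")
    size = len(order) // n_bins
    ln_avg_y2, ln_avg_var, counts = [], [], []
    for k in range(n_bins):
        # The last bin absorbs the few data left over after equal division.
        stop = (k + 1) * size if k < n_bins - 1 else len(order)
        idx = order[k * size:stop]
        ln_avg_y2.append(np.log(y2[idx].mean()))
        ln_avg_var.append(np.log(est_var[idx].mean()))
        counts.append(len(idx))
    return np.array(ln_avg_y2), np.array(ln_avg_var), np.array(counts)


# Hypothetical data: 23 points, unit errors, estimated variance e everywhere.
ln_y2, ln_var, counts = binned_ln_averages(
    np.ones(23), np.arange(23.0), np.full(23, np.e)
)
print(counts)
```

Plotting ln_avg_y2 against ln_avg_var, one point per bin, gives a display of
the kind described for the figures, with a perfect model falling on the
45-degree line.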

Figures 1-24 present graphs of ln [average $y(t)^2$] in each bin versus ln
[average estimated variance] in each bin for models (1) and (3) and models (2)
and (3). Figures 1, 5, 9, 13, 17, 21 (respectively 2, 6, 10, 14, 18, 22) show the
logarithm of the average of the $y(t)^2$ values of DA (respectively DB) versus
the logarithm of the average of the estimated variances for each bin using the
estimated parameters from DA (respectively DB). If a model were perfect, the
points would be close to the 45° line shown.

Figures 3, 7, 11, 15, 19, 23 (respectively 4, 8, 12, 16, 20, 24) present graphs
of ln average $y(t)^2$ of DA (respectively DB) versus ln average estimated
variances using parameters estimated using data DB (respectively DA). Once
again, if the model were perfect, the points would be close to the 45° line.

Since the two-variate model (3) is shown with both one-variate models, it
is possible to obtain some idea of the effect of the two different sets of bins on
the ln averages. In particular, the graphs corresponding to the 500 mb height
winds, Figures 9-16, show that the display of ln averages can be quite
sensitive to which variate is used to do the binning.

Keeping this binning sensitivity in mind, the figures suggest the
following concerning the models using observed winds as covariates. It
appears that of the two one-variate models, model (1), which uses r(t) as the
covariate, is the better. The two-variate model (3) appears not much better
than model (1). If wind speed is used as the single covariate, it appears to
overstate the variance; the addition of the second covariate r(t) in this case
tends to make the estimated variance smaller and to bring the ln
average predicted variance in a bin closer to the ln average $y^2$ in the bin.

Preliminary examination of ln average $y^2$ in bins and ln average model
variances in bins for the Cauchy models suggests that the Cauchy models
result in little or no improvement over the results of the normal model. The
results of the Cauchy models will not be reported here.

Another way to assess goodness of fit and to cross validate is to evaluate 
the ln-likelihood for the different models at the parameter estimates. Larger 
values of the ln-likelihood suggest better model fit; cf. Cox and Hinkley [1974]. 

Table 2 presents the values of the ln-likelihood, up to addition and
multiplication of constants, for the parameter estimates of Table 1; the
function being evaluated is

$$\mathcal{L} = -n\alpha - \sum_{i=1}^{n} x_i\beta - \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}, \tag{7}$$

where $x_i\beta$ denotes the linear combination of the covariates for the ith
observation, for example $\beta_1 r(t_i) + \beta_2 s(t_i)$ for model (3). The
values of $\mathcal{L}$ are presented for data DA (respectively DB) using the
parameters fit using DA (respectively DB); these are values assessing
goodness of fit. Since maximum likelihood is the estimation procedure, the
largest value of $\mathcal{L}$ in each of these two rows is the one
corresponding to the two-variate model. Values of $\mathcal{L}$ are also
presented for data DA (respectively DB) using the parameters fit using DB
(respectively DA); these are values assessing cross-validation. The
underlined value in each row is the maximum value in that row; the
corresponding model provides the best model fit. The bold italicized value in
each row is the maximum value for the two one-variate models; the
corresponding one-variate model provides the best model fit between the two
one-variate models.

TABLE 1. NORMAL MODELS
PARAMETER ESTIMATES
OBSERVED WIND COVARIATES

                              One-Variate Models              Two-Variate Models
Pressure  Wind   Data         r(t)              s(t)          ln MSE = α + β1 r(t) + β2 s(t)
Height    Comp.  Set           α        β        α        β        α       β1       β2

850       u      A          2.02    0.054     1.94    0.050     1.70    0.040    0.040
                 B          2.09    0.050     1.76    0.066     1.63    0.027    0.058
                 ALL        2.06    0.052     1.85    0.058     1.66    0.034    0.049
850       v      A          2.19    0.040     1.59    0.080     1.51    0.015    0.076
                 B          2.05    0.051     1.68    0.071     1.56    0.028    0.062
                 ALL        2.12    0.045     1.64    0.076     1.53    0.022    0.069
500       u      A          2.29    0.045     2.45    0.018     2.11    0.040    0.011
                 B          2.18    0.054     2.19    0.029     1.84    0.046    0.020
                 ALL        2.23    0.050     2.32    0.024     1.97    0.043    0.015
500       v      A          2.31    0.039     2.27    0.023     1.99    0.033    0.018
                 B          2.24    0.042     2.14    0.028     1.89    0.034    0.021
                 ALL        2.28    0.041     2.21    0.025     1.94    0.034    0.019
250       u      A          3.12    0.034     2.48    0.034     2.22    0.023    0.030
                 B          2.95    0.039     2.45    0.032     2.17    0.026    0.027
                 ALL        3.04    0.036     2.46    0.033     2.20    0.024    0.028
250       v      A          3.01    0.031     2.36    0.033     2.13    0.021    0.029
                 B          2.97    0.032     2.28    0.035     2.12    0.021    0.028
                 ALL        2.98    0.031     2.31    0.034     2.12    0.021    0.029

$r(t) = [(u(t) - u(t-1))^2 + (v(t) - v(t-1))^2]^{1/2}$
$s(t) = [u(t)^2 + v(t)^2]^{1/2}$

NOTE: Data are divided into two sets randomly without regard to data
values. One set is called A; the other is called B.

TABLE 2. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
OBSERVED WIND COVARIATES

Pressure  Wind                                   One-Variate Models   Two-Variate
Height    Comp.  Data Set  Model      Constant       r(t)       s(t)       Models

850       u      A         A           -7695.9    -7596.5    -7591.6      -7538.1
                 B         B           -7746.9    -7661.8    -7560.5      -7540.3
                 B         A           -7747.5    -7663.9    -7571.1      -7553.2
                 A         B           -7696.4    -7598.5    -7601.9      -7551.7
850       v      A         A           -7759.1    -7703.4    -7505.5      -7498.5
                 B         B           -7707.6    -7614.0    -7512.8      -7489.4
                 B         A           -7708.2    -7620.6    -7515.4      -7497.5
                 A         B           -7759.7    -7710.3    -7508.2      -7507.4
500       u      A         A           -9454.3    -9314.6    -9405.3      -9299.2
                 B         B           -9518.7    -9296.8    -9376.7      -9239.6
                 B         A           -9519.5    -9303.2    -9397.9      -9257.4
                 A         B           -9455.2    -9320.4    -9425.2      -9315.8
500       v      A         A           -9317.7    -9317.1    -9243.0      -9174.9
                 B         B           -9258.3    -9140.7    -9161.1      -9091.7
                 B         A           -9259.1    -9142.8    -9165.1      -9094.2
                 A         B           -9318.4    -9219.2    -9247.5      -9177.5
250       u      A         A          -11265.4   -10657.4   -10431.5     -10344.8
                 B         B          -10782.7   -10389.8   -10249.0     -10149.2
                 B         A          -10829.6   -10403.9   -10261.7     -10162.1
                 A         B          -11319.3   -10673.4   -10445.1     -10358.6
250       v      A         A          -10417.8   -10259.4   -10094.9     -10032.0
                 B         B          -10783.1   -10181.5   -10050.1      -9960.1
                 B         A          -10814.9   -10182.5   -10051.8      -9961.3
                 A         B          -10446.4   -10260.3   -10096.3     -10033.2

The models considered are the model in which $\{Y_j\}$ are independent normal
with mean 0 and variance

$$\sigma^2(t) = e^{\alpha} \quad \text{(constant variance)} \tag{8}$$

and models (1)-(3).
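
The evaluation of (7) and the goodness-of-fit and cross-validation comparison
can be sketched as below. For brevity only the constant variance model (8),
whose maximum likelihood estimate has the closed form alpha = ln(average
y^2), is fitted; parameters of models (1)-(3) obtained by any fitting routine
could be passed to the same function. The function names, the simulated
errors and the way the random halves are drawn are illustrative assumptions.

```python
import numpy as np


def ln_likelihood(alpha, beta, X, y):
    """Evaluate (7): -sum(eta_i) - sum(y_i^2 exp(-eta_i)), eta_i = alpha + x_i beta."""
    X = np.asarray(X, dtype=float)
    eta = alpha + X @ np.asarray(beta, dtype=float)
    return -np.sum(eta) - np.sum(np.asarray(y, dtype=float) ** 2 * np.exp(-eta))


def fit_constant(y):
    """Maximum likelihood estimate of alpha for the constant variance model (8)."""
    return np.log(np.mean(np.asarray(y, dtype=float) ** 2))


def random_halves(n, rng):
    """Split indices 0..n-1 into two random sets without regard to data values."""
    perm = rng.permutation(n)
    return perm[: n // 2], perm[n // 2:]


rng = np.random.default_rng(1)
y = rng.normal(0.0, 3.0, 1000)           # hypothetical first-guess errors
ia, ib = random_halves(len(y), rng)      # index sets for DA and DB
no_cov_a = np.zeros((len(ia), 0))        # the constant model has no covariates
alpha_a, alpha_b = fit_constant(y[ia]), fit_constant(y[ib])
fit_aa = ln_likelihood(alpha_a, [], no_cov_a, y[ia])   # goodness of fit
cv_ab = ln_likelihood(alpha_b, [], no_cov_a, y[ia])    # cross-validation
print(fit_aa, cv_ab)
```

For the constant variance model fit to the same data, (7) reduces to
-n(alpha + 1), and the cross-validation value can never exceed the
goodness-of-fit value on the same data.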

The two-variate model (3) maximizes the cross-validation value of
$\mathcal{L}$ for data DA (respectively DB) with a model using parameters fit
using DB (respectively DA). This suggests that both r(t) and s(t) together
have predictive ability.

For the one-variate models (1) and (2), the cross-validation values of
$\mathcal{L}$ for DA (respectively DB) using the parameters fit using DB
(respectively DA) are equally divided as to whether r(t) by itself or s(t) by
itself produces the higher value of $\mathcal{L}$. This suggests that neither
variate by itself has obviously better predictive value than the other. The
goodness-of-fit values of $\mathcal{L}$ for the one-variate models using DA
(respectively DB) are higher for s(t) the majority of the time. This suggests
that s(t) by itself provides a better description of the data than r(t) by
itself.

Let $\mathcal{L}_c$ denote the value of $\mathcal{L}$ for DA (respectively DB)
using the constant variance model (8) fit using DA (respectively DB).
Comparing $\mathcal{L}_c$ with the corresponding cross-validation value of
$\mathcal{L}$ for DA (respectively DB) using models (2) and (3) fit using DB
(respectively DA) indicates the following. The values of $\mathcal{L}$ for
models (2) and (3) fit with the other half of the data are larger than the
corresponding value $\mathcal{L}_c$ for the constant variance model fit using
the data to be modeled. This indicates that both models (2) and (3) fit with
the other half of the data describe the data better than the best constant
variance model (8) fit with the same data it is used to summarize.






3.2 First Guess Wind Covariate Models 

In this subsection we report the results of using models (1)-(3) and (8) with
first-guess winds as covariates; the two covariates considered are

$$r_f(t) = \left[(U_f(t) - U_f(t-1))^2 + (V_f(t) - V_f(t-1))^2\right]^{1/2}$$

and

$$s_f(t) = \left[U_f(t)^2 + V_f(t)^2\right]^{1/2}.$$

The analysis is the same as in the previous subsection. The data sets DA and
DB are the same as those in the previous subsection in each case.

The values of the parameter estimates appear in Table 3. The
corresponding values of $\mathcal{L}$ appear in Table 4. Once again the
underlined value of $\mathcal{L}$ is the largest value in each row; the bold
italicized value of $\mathcal{L}$ is the larger of the values for the two
one-variate models.

In all but two cases the values of $\mathcal{L}$ for the observed wind
covariates are larger than those for the first-guess wind covariates. This
suggests that the observed wind components have better predictive and
descriptive value than the first-guess wind components.

Table 4 also indicates the following results concerning models using
first-guess wind covariates. Between the two one-variate models (1) and (2),
the one-variate model using first-guess wind speed always has the greater
value of $\mathcal{L}$. This suggests that first-guess wind speed alone has
better predictive and descriptive value than $r_f(t)$ alone. The
cross-validation values of $\mathcal{L}$ for data DA (respectively DB) using
parameters fit with DB (respectively DA) are maximized about half the time
using the one-variate model with $s_f(t)$. The other times the maximal
$\mathcal{L}$ is associated with the two-variate model.

TABLE 3. NORMAL MODELS
PARAMETER ESTIMATES
FIRST GUESS WIND COVARIATES

                              One-Variate Models              Two-Variate Models
Pressure  Wind   Data        r_f(t)            s_f(t)
Height    Comp.  Set           α        β        α        β        α       β1       β2

850       u      A          2.52   -0.006     2.23    0.025     2.30   -0.023    0.031
                 B          2.48    0.004     2.21    0.029     2.26   -0.013    0.032
                 ALL        2.50  -0.0007     2.22    0.027     2.28   -0.017    0.032
850       v      A          2.54   -0.004     2.34    0.017     2.40   -0.016    0.021
                 B          2.41    0.013     2.27    0.021     2.28   -0.001    0.022
                 ALL        2.47    0.005     2.31    0.019     2.34   -0.008    0.021
500       u      A          2.61    0.023     2.35    0.023     2.25    0.015    0.021
                 B          2.68    0.019     2.48    0.017     2.39    0.014    0.016
                 ALL        2.65    0.021     2.42    0.020     2.31    0.014    0.018
500       v      A          2.71    0.006     2.40    0.018     2.39   0.0008    0.017
                 B          2.76   -0.002     2.29    0.022     2.35   -0.009    0.023
                 ALL        2.73    0.002     2.34    0.020     2.37   -0.004    0.020
250       u      A          4.02   -0.005     2.66    0.037     2.67   -0.002    0.037
                 B          3.46    0.021     3.01    0.022     2.78    0.019    0.021
                 ALL        3.74    0.009     2.79    0.031     2.69    0.008    0.031
250       v      A          3.49    0.007     3.00    0.017     2.93    0.006    0.017
                 B          3.53    0.016     2.87    0.026     2.60    0.018    0.026
                 ALL        3.50    0.012     2.92    0.022     2.75    0.013    0.022


TABLE 4. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
FIRST GUESS WIND COVARIATES

Pressure  Wind                                   One-Variate Models   Two-Variate
Height    Comp.  Data Set  Model      Constant     r_f(t)     s_f(t)       Models

850       u      A         A           -7695.9    -7695.1    -7667.6      -7657.9
                 B         B           -7746.9    -7746.6    -7708.8      -7705.6
                 B         A           -7747.5    -7749.5    -7709.8      -7708.8
                 A         B           -7696.4    -7697.7    -7668.6      -7660.9
850       v      A         A           -7759.1    -7758.7    -7745.4      -7741.1
                 B         B           -7707.6    -7703.7    -7685.5      -7685.5
                 B         A           -7708.2    -7711.3    -7687.3      -7691.8
                 A         B           -7759.7    -7765.3    -7747.3      -7746.7
500       u      A         A           -9454.3    -9433.7    -9391.1      -9383.0
                 B         B           -9518.7    -9505.7    -9481.9      -9475.2
                 B         A           -9519.5    -9507.5    -9486.2      -9479.1
                 A         B           -9455.2    -9435.5    -9395.4      -9387.0
500       v      A         A           -9317.7    -9316.0    -9281.2      -9281.2
                 B         B           -9258.3    -9258.2    -9202.6      -9199.5
                 B         A           -9259.1    -9261.5    -9205.6      -9206.2
                 A         B           -9318.4    -9319.7    -9284.3      -9288.4
250       u      A         A          -11265.4   -11263.9   -10907.3     -10907.0
                 B         B          -10782.7   -10745.7   -10684.5     -10653.5
                 B         A          -10829.6   -10846.5   -10739.9     -10745.7
                 A         B          -11319.3   -11371.6   -10987.2     -11035.7
250       v      A         A          -10417.8   -10414.2   -10349.3     -10346.7
                 B         B          -10783.1   -10758.9   -10622.4     -10587.8
                 B         A          -10814.9   -10796.9   -10658.7     -10640.3
                 A         B          -10446.4   -10446.4   -10379.2     -10389.1

Let $\mathcal{L}_c$ denote the value of $\mathcal{L}$ for DA (respectively DB)
using the constant variance model (8) fit using DA (respectively DB).
Comparing $\mathcal{L}_c$ with the cross-validation value of $\mathcal{L}$ for
DA (respectively DB) using models (2) and (3) fit using DB (respectively DA)
indicates the following. The values of $\mathcal{L}$ for models (2) and (3)
fit with the other half of the data are always larger than the corresponding
value $\mathcal{L}_c$ for the constant variance model fit using the data to be
modeled. This suggests that both models (2) and (3) fit with the other half of
the data describe the data somewhat better than the best constant variance
model (8) fit with the data to be described.

In summary, based on values of $\mathcal{L}$, when first-guess winds are used
as covariates it appears that the one-variate model using first-guess wind
speed is an attractive choice for predictive purposes. When observed winds
are used as covariates, the two-variate model appears to have the best
predictive value.

Assessing goodness of fit and cross-validation using values of $\mathcal{L}$
has the advantage of not being sensitive to binning. However, $\mathcal{L}$
may be sensitive to the choice of data sets DA and DB. Further work needs to
be done to develop procedures to assess goodness of fit and for
cross-validation. Procedures based on bootstrapping or jackknifing hold some
promise.

4. THE DATA ANALYSIS: APRIL AND FEBRUARY DATA

In this section we report results of an assessment of goodness of fit for the
normal models (1)-(3) for April data. We also report results concerning the
use of a model whose parameters are fit using February (respectively April)
data to model April (respectively February) data.

4.1 Observed Wind Covariate Models

In this subsection we report results for normal models (1)-(3) using
observed wind components as covariates. There are six analyses: one for the
u-wind component (respectively v-wind component) at each pressure
height.

Table 5 shows the values of the parameter estimates for both the February
data and April data. Table 6 shows the values of $\mathcal{L}$ for February
data (respectively April data) using parameters fit using February data
(respectively April data). Values of $\mathcal{L}$ are also presented for
February data (respectively April data) using parameters fit using April data
(respectively February data). Once again, larger values of $\mathcal{L}$
indicate better model fit. The underlined value in each row is the maximum
value in that row. The bold italicized value in each row is the maximum value
of $\mathcal{L}$ for the two one-variate models.

The values of $\mathcal{L}$ for February data (respectively April data) using
parameters fit using April data (respectively February data) are maximized by
the two-variate model in all but one case; between the two one-variate
models, $\mathcal{L}$ is maximized half the time by the model involving s(t).

Let $\mathcal{L}_c$ denote the value of $\mathcal{L}$ for the model of
constant variance (8) for February (respectively April) data fit using
February (respectively April) data. Comparing $\mathcal{L}_c$ with the
prediction value of $\mathcal{L}$ for models (2) and (3) for February
(respectively April) data fit using April (respectively February) data
indicates the following. The values of $\mathcal{L}$ for models (2) and (3)
fit with data from the other month are always larger than the corresponding
values of $\mathcal{L}_c$ fit with the data of the same month. This suggests
that models (2) and (3) fit using data from the other month have predictive
value over a model of constant variance fit using the data that is to be
modeled.

TABLE 5. NORMAL MODELS
PARAMETER ESTIMATES
OBSERVED WIND COVARIATES

                              One-Variate Models              Two-Variate Models
Pressure  Wind   Data         r(t)              s(t)          ln MSE = α + β1 r(t) + β2 s(t)
Height    Comp.  Set           α        β        α        β        α       β1       β2

850       u      Feb.       2.06    0.052     1.85    0.058     1.66    0.034    0.049
                 Apr.       1.86    0.084     1.69    0.086     1.44    0.053    0.068
850       v      Feb.       2.12    0.045     1.64    0.076     1.53    0.022    0.069
                 Apr.       1.83    0.089     1.69    0.090     1.42    0.062    0.065
500       u      Feb.       2.23    0.050     2.32    0.024     1.97    0.043    0.015
                 Apr.       2.12    0.058     2.20    0.030     1.80    0.049    0.023
500       v      Feb.       2.28    0.041     2.21    0.024     1.94    0.034    0.019
                 Apr.       2.02    0.065     1.97    0.041     1.66    0.052    0.027
250       u      Feb.       3.04    0.036     2.46    0.033     2.20    0.024    0.028
                 Apr.       2.77    0.044     2.69    0.027     2.33    0.037    0.018
250       v      Feb.       2.98    0.031     2.31    0.034     2.12    0.021    0.029
                 Apr.       2.73    0.041     2.61    0.027     2.26    0.034    0.019

$r(t) = [(u(t) - u(t-1))^2 + (v(t) - v(t-1))^2]^{1/2}$
$s(t) = [u(t)^2 + v(t)^2]^{1/2}$

TABLE 6. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
OBSERVED WIND COVARIATES

Pressure  Wind                                   One-Variate Models   Two-Variate
Height    Comp.  Data Set  Model      Constant       r(t)       s(t)       Models

850       u      Feb.      Feb.       -15443.1   -15259.3   -15157.3     -15084.9
                 Apr.      Apr.       -15709.1   -15311.6   -15169.9     -15043.6
                 Apr.      Feb.       -15713.2   -15368.2   -15238.7     -15117.7
                 Feb.      Apr.       -15447.0   -15321.2   -15240.3     -15171.9
850       v      Feb.      Feb.       -15467.0   -15320.8   -15019.6     -14992.1
                 Apr.      Apr.       -15819.1   -15320.6   -15307.2     -15106.4
                 Apr.      Feb.       -15827.8   -15429.9   -15386.7     -15251.5
                 Feb.      Apr.       -15475.3   -15428.6   -15105.?     -15121.1
500       u      Feb.      Feb.       -18973.4   -18614.5   -18792.2     -18547.4
                 Apr.      Apr.       -18504.1   -18083.1   -18270.7     -17969.4
                 Apr.      Feb.       -18528.9   -18093.4   -18280.6     -17989.1
                 Feb.      Apr.       -18999.9   -18625.3   -18804.3     -18576.6
500       v      Feb.      Feb.       -18576.4   -18358.8   -18406.2     -18267.9
                 Apr.      Apr.       -18698.2   -17869.0   -18014.4     -17733.0
                 Apr.      Feb.       -18699.0   -17938.0   -18086.9     -17796.2
                 Feb.      Apr.       -18577.1   -18421.3   -18480.2     -18345.2
250       u      Feb.      Feb.       -22073.2   -21054.7   -20687.1     -20500.6
                 Apr.      Apr.       -22712.2   -20364.3   -20658.3     -20195.9
                 Apr.      Feb.       -22712.4   -20439.9   -20703.3     -20276.4
                 Feb.      Apr.       -22073.4   -21127.5   -20726.0     -20591.9
250       v      Feb.      Feb.       -21216.0   -20441.4   -20145.7     -19992.6
                 Apr.      Apr.       -21205.7   -20006.9   -20274.8     -19840.6
                 Apr.      Feb.       -21240.2   -20065.6   -20338.7     -19924.9
                 Feb.      Apr.       -21252.5   -20498.8   -20190.2     -20070.5

4.2 First Guess Wind Covariate Models

In this subsection we report results for normal models (1)-(3) using
first-guess wind components as covariates.

Table 7 shows the values of the parameter estimates for both February
data and April data. Table 8 shows the values of $\mathcal{L}$ for February
data (respectively April data) using parameters fit using February data
(respectively April data). Values of $\mathcal{L}$ are also presented for
February data (respectively April data) using parameters fit using April data
(respectively February data). The underlined value in each row is the maximum
value in that row. The bold italicized value in each row is the maximum value
of $\mathcal{L}$ for the two one-variate models.

The values of $\mathcal{L}$ for the observed wind covariates are larger than
those for the first-guess wind covariates except for two values associated
with the one-variate model using s(t) to model u-wind component error at the
250 mb height for the model using parameters fit with the same data. This
suggests that the observed wind covariates provide a better model of the data
both in terms of goodness-of-fit and prediction.

The values of $\mathcal{L}$ for February data (respectively April data) using
parameters fit using April data (respectively February data) are maximized
about half the time by the two-variate model and the other half the time by
the one-variate model using the first-guess wind speed $s_f(t)$.

TABLE 7. NORMAL MODELS
PARAMETER ESTIMATES
FIRST GUESS WIND COVARIATES

                              One-Variate Models              Two-Variate Models
Pressure  Wind   Data        r_f(t)            s_f(t)
Height    Comp.  Set           α        β        α        β        α       β1       β2

850       u      Feb.       2.50  -0.0007     2.22    0.027     2.28   -0.017    0.032
                 Apr.       2.42    0.022     2.19    0.041     2.19   -0.002    0.041
850       v      Feb.       2.47    0.005     2.31    0.019     2.34   -0.008    0.021
                 Apr.       2.46    0.019     2.23    0.039     2.24   -0.004    0.040
500       u      Feb.       2.65    0.021     2.41    0.020     2.32    0.014    0.018
                 Apr.       2.50    0.031     2.23    0.030     2.11    0.019    0.028
500       v      Feb.       2.73    0.002     2.34    0.020     2.37   -0.004    0.020
                 Apr.       2.30    0.061     1.98    0.045     1.74    0.040    0.040
250       u      Feb.       3.74    0.009     2.79    0.031     2.69    0.008    0.031
                 Apr.       4.00   -0.010     3.48    0.014     3.63   -0.021    0.016
250       v      Feb.       3.50    0.012     2.92    0.022     2.75    0.013    0.022
                 Apr.       3.26    0.025     2.80    0.026     2.68    0.015    0.024

TABLE 8. NORMAL MODELS
VALUES OF LN-LIKELIHOOD
FIRST GUESS WIND COVARIATES

Pressure  Wind                                   One-Variate Models   Two-Variate
Height    Comp.  Data Set  Model      Constant     r_f(t)     s_f(t)       Models

850       u      Feb.      Feb.       -15443.1   -15443.0   -15376.0     -15365.0
                 Apr.      Apr.       -15709.1   -15695.3   -15593.3     -15593.1
                 Apr.      Feb.       -15713.2   -15714.0   -15620.6     -15624.2
                 Feb.      Apr.       -15447.0   -15471.3   -15413.2     -15409.9
850       v      Feb.      Feb.       -15467.0   -15466.0   -15431.8     -15429.5
                 Apr.      Apr.       -15819.1   -15808.8   -15716.0     -15715.7
                 Apr.      Feb.       -15827.8   -15824.2   -15759.4     -15757.7
                 Feb.      Apr.       -15475.3   -15846.6   -15493.2     -15490.5
500       u      Feb.      Feb.       -18973.4   -18940.3   -18875.1     -18860.2
                 Apr.      Apr.       -18504.1   -18455.5   -18331.3     -18312.9
                 Apr.      Feb.       -18528.9   -18473.2   -18351.3     -18332.6
                 Feb.      Apr.       -18999.9   -18956.6   -18897.5     -18885.5
500       v      Feb.      Feb.       -18576.4   -18575.9   -18485.3     -18484.1
                 Apr.      Apr.       -18698.2   -18509.1   -18289.2     -18208.2
                 Apr.      Feb.       -18699.0   -18683.4   -18423.3     -18441.4
                 Feb.      Apr.       -18577.1   -18805.8   -18656.5     -18778.6
250       u      Feb.      Feb.       -22073.2   -22061.3   -21624.3     -21613.7
                 Apr.      Apr.       -22712.2   -22699.3   -22641.2     -22609.5
                 Apr.      Feb.       -22712.4   -22739.2   -22906.5     -22987.4
                 Feb.      Apr.       -22073.4   -22139.0   -21792.3     -21863.5
250       v      Feb.      Feb.       -21216.0   -21190.5   -20988.0     -20957.8
                 Apr.      Apr.       -21205.7   -21142.9   -20919.7     -20899.6
                 Apr.      Feb.       -21240.2   -21181.4   -20927.1     -20902.2
                 Feb.      Apr.       -21252.5   -21232.3   -20994.4     -20961.4

A comparison of the value of $\hat{\ell}_c$ for the constant variance model of
February (respectively April) data, fit using the same month's data, with the
prediction values of $\hat{\ell}$ for models (1)-(3) of February (respectively
April) data, fit using the other month's (April, respectively February) data,
indicates the following. A little fewer than half the time $\hat{\ell}_c$ is
larger than the corresponding values of $\hat{\ell}$ for models (1)-(3) fit
with the other month's data. This suggests that the first-guess wind speed
models fit using the other month's data may not describe the data as well as a
constant variance model fit using the data being modeled. This may be an
indication that models fit using first-guess February wind (respectively April
wind) data are not good predictors of April (respectively February) wind
component error.
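
The comparison described above can be tallied directly from Table 8. The
following is a minimal sketch, not the authors' procedure: the dictionary
layout and the function name `tally` are introduced here for illustration, and
the numbers are the Table 8 entries as transcribed from the scanned report, so
individual digits may carry scanning errors.

```python
# Table 8 entries (first-guess wind covariates), as transcribed.
# key:   (pressure level in mb, wind component, month of the data)
# value: (ln-likelihood of the constant model fit on the same month,
#         ln-likelihoods of models (1), (2), (3) fit on the other month)
TABLE8 = {
    ("850", "u", "Feb"): (-15443.1, (-15471.3, -15413.2, -15409.9)),
    ("850", "u", "Apr"): (-15709.1, (-15714.0, -15620.6, -15624.2)),
    ("850", "v", "Feb"): (-15467.0, (-15846.6, -15493.2, -15490.5)),
    ("850", "v", "Apr"): (-15819.1, (-15824.2, -15759.4, -15757.7)),
    ("500", "u", "Feb"): (-18973.4, (-18956.6, -18897.5, -18885.5)),
    ("500", "u", "Apr"): (-18504.1, (-18473.2, -18351.3, -18332.6)),
    ("500", "v", "Feb"): (-18576.4, (-18805.8, -18656.5, -18778.6)),
    ("500", "v", "Apr"): (-18698.2, (-18683.4, -18423.3, -18441.4)),
    ("250", "u", "Feb"): (-22073.2, (-22139.0, -21792.3, -21863.5)),
    ("250", "u", "Apr"): (-22712.2, (-22739.2, -22906.5, -22987.4)),
    ("250", "v", "Feb"): (-21216.0, (-21232.3, -20994.4, -20961.4)),
    ("250", "v", "Apr"): (-21205.7, (-21181.4, -20927.1, -20902.2)),
}


def tally(table):
    """Count comparisons in which the constant model has the larger value."""
    larger = sum(l_c > l for l_c, others in table.values() for l in others)
    total = sum(len(others) for _, others in table.values())
    return larger, total


larger, total = tally(TABLE8)
print(f"constant model larger in {larger} of {total} comparisons")
```

With the values as transcribed, the constant model has the larger
ln-likelihood in 14 of the 36 comparisons.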

4.3 Conclusions 

Models (2) and (3) using observed wind components as covariates and fit
using February (respectively April) data appear to have predictive value for
April (respectively February) data. It is less clear whether models (1)-(3)
using first-guess wind components as covariates and fit using February
(respectively April) data have predictive value for April (respectively
February) wind component error data. It might be that models (1)-(3) fit with
first-guess data from other Aprils (respectively Februarys) are better
predictors of April (respectively February) wind component error.
Alternatively, if first-guess winds are to be used as predictors, it might be
worthwhile to develop a procedure to update the fitted model parameters using
new data as they come in.
APPENDIX A

MAXIMUM LIKELIHOOD ESTIMATION FOR THE NORMAL MODEL

Let $Y_1, Y_2, \ldots, Y_n$ be independent normal random variables with mean 0
and variances

$$\sigma_i^2 = \exp\Bigl\{\alpha + \sum_{j=1}^{p} x_{ij}\beta_j\Bigr\} \equiv \exp\{\alpha + x_i\beta\}, \qquad i = 1, \ldots, n, \tag{A.1}$$

where $(x_{i1}, \ldots, x_{ip})$ are fixed explanatory variables associated with $Y_i$.

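The likelihood function for this model is

$$L(\alpha, \beta; y) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\Bigl\{-\tfrac{1}{2}(\alpha + x_i\beta)\Bigr\} \exp\Bigl\{-\tfrac{1}{2}\, y_i^2 \exp\{-(\alpha + x_i\beta)\}\Bigr\}. \tag{A.2}$$

Hence, the ln-likelihood function is

$$\ell(\alpha, \beta; y) = \frac{1}{2}\Bigl[-n\alpha - \sum_{i=1}^{n} x_i\beta - \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}\Bigr] - \frac{n}{2}\ln 2\pi. \tag{A.3}$$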
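
As an illustration, (A.3) can be evaluated numerically in a few lines. This is
a minimal sketch assuming NumPy; the function name `ln_likelihood` and its
argument layout are introduced here and are not part of the report.

```python
import numpy as np


def ln_likelihood(alpha, beta, x, y):
    """Evaluate (A.3) for mean-zero normal data with ln-variance alpha + x_i . beta.

    x : (n, p) covariates (a 1-D array is treated as a single covariate)
    y : (n,) observations
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:                      # single covariate given as a 1-D array
        x = x[:, None]
    beta = np.atleast_1d(np.asarray(beta, dtype=float))
    eta = alpha + x @ beta               # eta_i = ln sigma_i^2, from (A.1)
    n = y.size
    # (A.3): -1/2 * sum_i [eta_i + y_i^2 exp(-eta_i)] - (n/2) ln(2 pi)
    return -0.5 * np.sum(eta + y ** 2 * np.exp(-eta)) - 0.5 * n * np.log(2.0 * np.pi)
```

Setting every element of `beta` to zero gives the constant variance model,
so the same function can be used to compare a covariate model with the
constant model on one data set.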



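Computing partial derivatives of $\ell$ with respect to $\alpha$ and $\beta_j$ results in

$$\frac{\partial}{\partial\alpha}\,\ell(\alpha, \beta; y) = \frac{1}{2}\Bigl[-n + \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}\Bigr], \tag{A.4}$$

$$\frac{\partial}{\partial\beta_j}\,\ell(\alpha, \beta; y) = \frac{1}{2}\Bigl[-\sum_{i=1}^{n} x_{ij} + \sum_{i=1}^{n} y_i^2 \exp\{-(\alpha + x_i\beta)\}\, x_{ij}\Bigr]. \tag{A.5}$$

Setting $\partial\ell/\partial\alpha = 0$ results in the equation

$$e^{\alpha} = \frac{1}{n}\sum_{i=1}^{n} y_i^2 \exp\{-x_i\beta\}. \tag{A.6}$$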
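Setting $\partial\ell/\partial\beta_j = 0$ and replacing $e^{\alpha}$ by (A.6) yields the equation

$$0 = f_j(\beta) \equiv -\bar{x}_j \sum_{i=1}^{n} y_i^2 e^{-x_i\beta} + \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\, x_{ij}, \tag{A.7}$$

where $\bar{x}_j = \frac{1}{n}\sum_{i=1}^{n} x_{ij}$.

Further,

$$\frac{\partial}{\partial\beta_k} f_j(\beta) = \bar{x}_j \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\, x_{ik} - \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\, x_{ij} x_{ik}. \tag{A.8}$$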



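If $f_k(\beta) = 0$, then

$$\bar{x}_k \sum_{i=1}^{n} y_i^2 e^{-x_i\beta} = \sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\, x_{ik}. \tag{A.9}$$

Substituting (A.9) into (A.8) yields

$$\frac{\partial}{\partial\beta_k} f_j(\beta) = -\sum_{i=1}^{n} y_i^2 e^{-x_i\beta}\,(x_{ij} x_{ik} - \bar{x}_j \bar{x}_k). \tag{A.10}$$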
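
The substitution leading to (A.10) can be written out in one more step. The
following is a sketch of that step, assuming (A.9) is applied to the $k$th
equation and writing $w_i = y_i^2 e^{-x_i\beta}$, a shorthand introduced here
that does not appear in the report.

```latex
\begin{align*}
\frac{\partial f_j}{\partial \beta_k}(\beta)
  &= \bar{x}_j \sum_{i=1}^{n} w_i x_{ik} - \sum_{i=1}^{n} w_i x_{ij} x_{ik}
     && \text{(A.8)} \\
  &= \bar{x}_j \bar{x}_k \sum_{i=1}^{n} w_i - \sum_{i=1}^{n} w_i x_{ij} x_{ik}
     && \text{by (A.9)} \\
  &= -\sum_{i=1}^{n} w_i \left( x_{ij} x_{ik} - \bar{x}_j \bar{x}_k \right)
     && \text{(A.10)}
\end{align*}
```

Since (A.9) holds only where $f_k(\beta) = 0$, (A.10) agrees with (A.8) exactly
at the solution and serves as an approximation to it elsewhere.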



An iteration of a Newton procedure to solve the system of equations
$0 = f_j(\beta)$, $(j = 1, \ldots, p)$ yields the system of linear equations

$$0 = f_j(\beta^0) + \sum_{k=1}^{p} \frac{\partial f_j}{\partial\beta_k}(\beta^0)\,(\beta_k - \beta_k^0), \qquad j = 1, \ldots, p, \tag{A.11}$$

that is, using (A.10),

$$f_j(\beta^0) = \sum_{k=1}^{p}\sum_{i=1}^{n} y_i^2 e^{-x_i\beta^0}\,(x_{ij} x_{ik} - \bar{x}_j \bar{x}_k)\,(\beta_k - \beta_k^0), \tag{A.12}$$

where $\beta^0$ is the current value for $\beta$. This system of linear equations
is solved for $\{\beta_k\}$. The Newton procedure is iterated until it converges.
The resulting $\{\hat{\beta}_k\}$ are the maximum likelihood estimates of
$\{\beta_k\}$ and the estimate of $\alpha$ is obtained from (A.6).

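To summarize, the procedure to estimate $\alpha$ and $\{\beta_j\}$ is as follows.

1. Start with $\beta_j^0 = 0$, $j = 1, \ldots, p$.

2. Solve the linear system of equations (A.12) for $\{\beta_k\}$.

3. If $\max_j |\beta_j - \beta_j^0| < 0.001$, stop and set $\hat{\beta}_j = \beta_j$;
   otherwise set $\beta_j^0 = \beta_j$ and return to step 2.

4. Compute

$$\hat{\alpha} = \ln\Bigl[\frac{1}{n}\sum_{i=1}^{n} y_i^2 \exp\{-x_i\hat{\beta}\}\Bigr].$$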
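
The four steps above can be sketched in code as follows. This is an
illustrative implementation assuming NumPy, not the authors' program: the
function name `fit_log_linear_variance`, the iteration cap `max_iter`, and the
`RuntimeError` on non-convergence are introduced here, and the stopping rule
is taken to be the largest absolute change in the coefficients with tolerance
0.001, which is an assumption about the exact form of the criterion in step 3.

```python
import numpy as np


def fit_log_linear_variance(x, y, tol=1e-3, max_iter=50):
    """Newton-type iteration of Appendix A for Var(Y_i) = exp(alpha + x_i . beta).

    x : (n, p) array of covariates (a 1-D array is treated as p = 1)
    y : (n,) array of mean-zero observations
    Returns (alpha_hat, beta_hat).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    xbar = x.mean(axis=0)                     # column means, x-bar_j
    y2 = y ** 2
    beta = np.zeros(x.shape[1])               # step 1: beta^0 = 0

    for _ in range(max_iter):
        w = y2 * np.exp(-x @ beta)            # w_i = y_i^2 exp(-x_i beta^0)
        f = x.T @ w - xbar * w.sum()          # f_j(beta^0), equation (A.7)
        # Coefficient matrix of (A.12): sum_i w_i (x_ij x_ik - xbar_j xbar_k)
        h = (x * w[:, None]).T @ x - w.sum() * np.outer(xbar, xbar)
        beta_new = beta + np.linalg.solve(h, f)    # step 2: solve (A.12)
        # step 3 (assumed form): stop when the largest absolute change < tol
        done = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if done:
            break
    else:
        raise RuntimeError("Newton iteration did not converge")

    alpha = np.log(np.mean(y2 * np.exp(-x @ beta)))   # step 4, from (A.6)
    return alpha, beta
```

Because (A.10) is used for the coefficient matrix, each pass solves a p-by-p
linear system in the covariate coefficients only, and the intercept is
recovered once at the end from (A.6).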
Figure 1. 850 mb u wind; model A on data A; February observed wind.
Figure 2. 850 mb u wind; model B on data B; February observed wind.
Figure 3. 850 mb u wind; model B on data A; February observed wind.
Figure 4. 850 mb u wind; model A on data B; February observed wind.
Figure 5. 850 mb v wind; model A on data A; February observed wind.
Figure 6. 850 mb v wind; model B on data B; February observed wind.
Figure 7. 850 mb v wind; model B on data A; February observed wind.
Figure 8. 850 mb v wind; model A on data B; February observed wind.
Figure 9. 500 mb u wind; model A on data A; February observed wind.
Figure 10. 500 mb u wind; model B on data B; February observed wind.
Figure 11. 500 mb u wind; model B on data A; February observed wind.
Figure 12. 500 mb u wind; model A on data B; February observed wind.

In each figure the axes are ln average MSE per bin and ln average predicted
MSE per bin; the one-variate model in r(t) is plotted as °, the two-variate
model as +, and the bins are based on r(t).

APPENDIX B. A GRAPHICAL ASSESSMENT OF GOODNESS OF FIT AND
CROSS-VALIDATION OF MODELS OF FEBRUARY WIND COMPONENT
MEAN SQUARE ERROR USING FIRST-GUESS WIND COVARIATES

In this appendix we present figures assessing goodness of fit and
cross-validation of the normal models (1)-(3) with first-guess wind covariates
fit to February data. As in subsection (3.2) the data are randomly divided into
two sets called DA and DB without regard to the values of the data; these sets
are the same as those in that section.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for each set DA and DB and appear in Table 3. The estimated
variances $\sigma_1^2(1,t)$, $\sigma_1^2(2,t)$, $\sigma_2^2(t)$ are computed for
the parameters estimated from DA and DB using (1)-(3) for each data point in DA
and DB.

To assess models (1) and (3) the data (y(t), r(t), s(t)) are binned into 10 bins
based on ordering the values of r(t) from smallest to largest. The data in the
first bin correspond to the smaller values of r(t); the data in the 10th bin
correspond to the larger values of r(t). Each bin contains about 1/10th of the
data with the 10th bin containing a few more data. The averages of the
estimated variances for models (1) and (3) are computed for each bin. The
average of $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
based on values of s(t).
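
The binning procedure can be sketched as follows. This is a minimal
illustration assuming NumPy; the function name `binned_ln_averages` and its
arguments are introduced here, and the rule that the last bin absorbs the
remainder is one reading of "the 10th bin containing a few more data".

```python
import numpy as np


def binned_ln_averages(y, est_var, bin_on, n_bins=10):
    """Return one (ln average estimated variance, ln average y^2) pair per bin.

    y       : (n,) errors y(t)
    est_var : (n,) estimated variances from a fitted model
    bin_on  : (n,) covariate used for ordering, e.g. r(t) or s(t)
    """
    y = np.asarray(y, dtype=float)
    est_var = np.asarray(est_var, dtype=float)
    if y.size < n_bins:
        raise ValueError("need at least one observation per bin")
    # Order the observations from the smallest to the largest covariate value.
    order = np.argsort(np.asarray(bin_on, dtype=float), kind="stable")
    size = y.size // n_bins
    points = []
    for b in range(n_bins):
        # Equal-sized bins; the last bin takes the few leftover observations.
        stop = (b + 1) * size if b < n_bins - 1 else y.size
        idx = order[b * size:stop]
        points.append((np.log(np.mean(est_var[idx])),
                       np.log(np.mean(y[idx] ** 2))))
    return np.array(points)
```

Plotting the second column against the first gives one point per bin; for a
well-fitting model the points lie near the 45° line.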

Figures 1B-24B present graphs of the ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1B, 5B, 9B, 13B, 17B, 21B (respectively 2B, 6B, 10B, 14B,
18B, 22B) show the logarithm of the average of the $y(t)^2$ values of DA
(respectively DB) versus the logarithm of the average of the estimated
variances for each bin using the estimated parameters from DA (respectively
DB). If a model were perfect, a point should be close to the 45° line shown.
These figures assess goodness of fit.

Figures 3B, 7B, 11B, 15B, 19B, 23B (respectively 4B, 8B, 12B, 16B, 20B, 24B)
present graphs of ln average $y(t)^2$ of DA (respectively DB) versus ln average
estimated variances using parameters estimated using data DB (respectively
DA). Once again if the model were perfect, the points would be close to the
45° line.

As suggested by the values of the ln-likelihood $\ell$ in Tables 2 and 4, the
figures for models using first-guess covariates indicate weaker goodness of fit
and weaker cross-validation than Figures 1-24 for models with observed wind
speed covariates. Both goodness of fit and cross-validation appear to
improve somewhat at the 250 mb pressure height level (Figures 17B-24B). This
suggests that models using first-guess covariates have greater predictive and
descriptive value at the 250 mb height level. However, they appear to be not as
good as models using observed wind speed as covariates.
APPENDIX C. GRAPHICAL ASSESSMENT OF GOODNESS OF FIT AND
CROSS-VALIDATION OF MODELS FOR FEBRUARY AND APRIL WIND
COMPONENT MEAN SQUARE ERROR USING OBSERVED WIND
COVARIATES

In this appendix we present graphs assessing goodness of fit and
predictive ability of the normal models (1)-(3) with observed wind covariates
fit to April and February data.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for both February and April data and are displayed in Table 5. The
estimated variances $\sigma_1^2(1,t)$, $\sigma_1^2(2,t)$, $\sigma_2^2(t)$ are
computed for the parameters estimated from February and April data using
(1)-(3) for each data point in February and April.

To assess models (1) and (3) the data (y(t), r(t), s(t)) for each data set are
binned into 10 bins based on ordering the values of r(t) from smallest to
largest. The data in the first bin correspond to the smaller values of r(t); the
data in the 10th bin correspond to the larger values of r(t). Each bin contains
about 1/10 of the data with the 10th bin containing a few more. The averages of
the estimated variances for models (1) and (3) are computed for each bin. The
average of $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
done using s(t).

Figures 1C-24C present graphs of the ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1C, 5C, 9C, 13C, 17C, 21C (respectively 2C, 6C, 10C, 14C,
18C, 22C) show the logarithm of the average of the $y(t)^2$ values for February
(respectively April) versus the logarithm of the average of the estimated
variances for each bin using the estimated parameters from February
(respectively April). If a model were perfect, a point should be close to the 45°
line shown. These figures assess goodness of fit.

Figures 3C, 7C, 11C, 15C, 19C, 23C (respectively 4C, 8C, 12C, 16C, 20C, 24C)
present graphs of ln average $y(t)^2$ of February (respectively April) versus ln
average estimated variances using parameters estimated using April
(respectively February) data. Once again if the model were perfect, the points
would be close to the 45° line. These figures assess the ability of models fit
using February (respectively April) observed data to predict April
(respectively February) wind component mean square error.

The figures indicate once again that the display of ln averages can be quite
sensitive to which variate is used to do the binning.

Keeping this binning sensitivity in mind, the figures suggest the
following. The two-variate model (3) appears to best describe and predict the
mean square component wind error. Of the two one-variate models, model
(1), which uses r(t) as the covariate, appears to be better. The one-variate
model using s(t) appears to tend to overstate the predicted mean square error.
The addition of the second covariate r(t) to the one-variate model using s(t)
appears to tend to decrease the predicted mean square error.
APPENDIX D. GRAPHICAL ASSESSMENT OF GOODNESS OF FIT AND
CROSS-VALIDATION OF MODELS FOR FEBRUARY AND APRIL WIND
COMPONENT MEAN SQUARE ERROR USING FIRST-GUESS WIND
COVARIATES

In this appendix we present graphs assessing goodness of fit and
predictive ability of the normal models (1)-(3) with first-guess wind
covariates fit to April and February data.

The maximum likelihood parameter estimates for each model (1)-(3) are
obtained for both February and April data and are displayed in Table 7. The
estimated variances $\sigma_1^2(1,t)$, $\sigma_1^2(2,t)$, $\sigma_2^2(t)$ are
computed for the parameters estimated from February and April data using
(1)-(3) for each data point in February and April.

To assess models (1) and (3) the data (y(t), r(t), s(t)) for each data set are
binned into 10 bins based on ordering the values of r(t) from smallest to
largest. The data in the first bin correspond to the smaller values of r(t); the
data in the 10th bin correspond to the larger values of r(t). Each bin contains
about 1/10 of the data with the 10th bin containing a few more. The averages of
the estimated variances for models (1) and (3) are computed for each bin. The
average of $y(t)^2$ is also computed for each bin.

To assess models (2) and (3) the same procedure is used but the binning is
done using s(t).

Figures 1D-24D present graphs of the ln[average $y(t)^2$] in each bin versus
ln[average estimated variance] in each bin for models (1) and (3) and models
(2) and (3). Figures 1D, 5D, 9D, 13D, 17D, 21D (respectively 2D, 6D, 10D, 14D,
18D, 22D) show the logarithm of the average of the $y(t)^2$ values for February
(respectively April) versus the logarithm of the average of the estimated
variance for each bin using the estimated parameters from February
(respectively April). If a model were perfect, a point should be close to the 45°
line shown. These figures are an indication of goodness of fit.

Figures 3D, 7D, 11D, 15D, 19D, 23D (respectively 4D, 8D, 12D, 16D, 20D,
24D) present graphs of ln average $y(t)^2$ of February (respectively April)
versus ln average estimated variances using parameters estimated using April
(respectively February) data. Once again if the model were perfect, the points
would be close to the 45° line. These figures assess the ability of models fit
using February (respectively April) first-guess data to predict April
(respectively February) wind component mean square error.

The figures indicate once again that the display of ln averages can be quite
sensitive to which variate is used to do the binning.

The figures indicate the following. As suggested by comparison of the
ln-likelihood values, $\ell$, of Tables 6 and 8 for models with observed wind
covariates and first-guess wind covariates, the figures suggest that models
using first-guess wind covariates do not describe or predict mean square error
for wind components as well as models using observed wind components.
The two-variate model appears to tend to produce smaller mean square errors
than the one-variate models; this tendency is most striking in the figure with
first-guess wind speed being used as the single covariate.

The models fit using April first-guess data appear to tend to be better
descriptive and predictive models than those fit using February first-guess
data.
The figures indicating predictive ability (3D, 4D, 7D, 8D, 11D, 15D, 19D,
20D, 23D and 24D) correspond fairly well to the differences between the
minimizing value of $\ell$ for the models with covariates and the value of
$\ell$ for the constant model (no covariates) in the corresponding rows of
Table 8. If the value of $\ell$ for the constant model is larger than any other
values in the row, the corresponding figure for that row shows no association.
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